Racial and ethnic disparities in workers’ compensation claims rates

Background Workers of color experience a disproportionate share of work-related injuries and illnesses (WRII). However, most workers’ compensation systems do not collect race and ethnicity information, making it difficult to monitor trends over time or to investigate specific policies and procedures that maintain, or could eliminate, the unequal burden of WRII for workers of color. The purpose of this study is to apply a Bayesian method to Washington workers’ compensation claims data to identify racial and ethnic disparities in WRII by industry and occupation, improving upon existing surveillance limitations. Measuring differences in risk for WRII will better inform prevention efforts and target prevention to those at increased risk. Methods To estimate WRII by race/ethnicity, we applied the Bayesian Improved Surname Geocode (BISG) method to surname and residential address data among all Washington workers’ compensation claims filed for injuries in 2013–2017. We then compared worker and injury characteristics by imputed race/ethnicity, and estimated rates of WRII by imputed race/ethnicity within industry and occupation. Results Black/African Americans had the highest rates of WRII claims across all industry and occupational sectors. Hispanic/Latino WRII claimants also had higher rates than Whites and Asian/Pacific Islanders in almost all industry and occupational sectors. For accepted claims with both medical and non-medical compensation, bodily reaction/overexertion injuries accounted for almost half of the claims during this reporting period. Discussion The high rates of injury we report by racial/ethnic categories are a cause for major concern. Nearly all industry- and occupation-specific rates of workers’ compensation claims are higher for Black/African American and Hispanic/Latino workers compared to Whites. More work is needed to identify work-related, systemic, and individual characteristics.


Introduction
Rates of non-fatal WRII have been reported to vary by race and ethnicity, although estimates based on large, representative datasets are rare. Instead, evidence that workers of color bear a disproportionate share of WRII often comes from two types of studies. The first are narrow in scope by industry, injury outcome, geography, or time period [1][2][3], limiting their ability to monitor trends over time or to make comparisons across worker groups. In the second type, evidence of disparities is based on group-level comparisons rather than case-level data. Injury data by industry and occupation are compared to employment data by race/ethnicity, leading researchers to conclude that non-white workers are at higher risk of workplace injuries because they are more likely to be employed in industries and occupations with high rates of WRII. In this second approach, differences in WRII by race/ethnicity within industries cannot be measured [4,5]. A better understanding of existing disparities in WRII by race and ethnicity within industries and occupations is crucial to developing and targeting injury and illness prevention efforts and advancing health equity [6,7].
Most large-scale data sources routinely used for monitoring nonfatal WRII are insufficient for estimating risks by worker race/ethnicity [8]. The Bureau of Labor Statistics annual Survey of Occupational Injuries and Illnesses (SOII) collects information on race/ethnicity for cases involving days of missed work, but the race/ethnicity data are missing for over 40% of cases [9]. The Behavioral Risk Factor Surveillance System (BRFSS) can be used for state estimates of WRII by race/ethnicity when states include an optional work injury module in the survey. However, single-year sample sizes are not sufficient for producing estimates by race/ethnicity in combination with other factors such as industry, so multiple years must be combined, which means that timely surveillance of injury rates by race/ethnicity cannot be accomplished with BRFSS [10]. The nationally representative, population-based National Health Interview Survey (NHIS) has a large sample size and has been used to assess differences in WRII by race/ethnicity [11,12], but as with BRFSS, estimates of WRII depend on a work supplement that is included only sporadically. Medical care data such as hospital discharge or emergency department data may capture race/ethnicity (often as perceived by the health care provider rather than as reported by the patient), but these sources generally lack information on the injured worker's industry or occupation.
When self-reported race/ethnicity is not available, the Bayesian Improved Surname Geocode (BISG) method, developed by the RAND Corporation, can be used to estimate race/ethnicity based on last name and residential address data [13,14]. Using the racial and ethnic population distributions within census block groups in combination with proportions by last names listed in the Census Surname File, BISG is used to calculate a probability of belonging to the following six mutually exclusive categories: (1) White, (2) Black/African American, (3) American Indian/Alaska Native, (4) Asian/Pacific Islander, (5) "More than one race," and (6) Hispanic/Latino (all races).
Workers' compensation claims data are administrative data that usually lack self-reported race/ethnicity but do include worker name and address data, the information needed to estimate race/ethnicity using BISG. Washington State's workers' compensation data include additional information detailing the injury event and severity, the industry of the injured worker's employer, and the injured worker's demographic characteristics. The purpose of this study is to apply BISG to Washington workers' compensation claims data to identify racial and ethnic disparities in WRII by industry and occupation, improving upon the surveillance limitations identified earlier. Measuring differences in risk for WRII will better inform prevention efforts and target prevention to those at increased risk.

Materials and methods
Data used in this study come from 1) the Washington State Department of Labor & Industries, 2) the American Community Survey, 3) the Census Surname List, and 4) the Current Population Survey (CPS), via the Centers for Disease Control and Prevention Employed Labor Force (ELF) online query, all of which are described below.

Washington's workers' compensation system
All Washington State employers are required to obtain workers' compensation insurance unless workers are covered by an alternative workers' compensation system (e.g., the federal government, employers of railroad or long-shore workers), or are specifically exempted in Washington statute, such as the self-employed. Claims are classified as rejected (the claim did not meet the requirements for the worker to receive medical or time-loss compensation), accepted for medical aid only (the worker received benefits for medical costs only), or accepted for both medical and non-medical costs, including time-loss compensation, permanent disability awards, survivors' benefits, funeral expenses, and/or pension benefits. Accepted claims are assigned Occupational Injury and Illness Coding System (OIICS) v.1 codes [17] based on information on the Report of Industrial Injury or Occupational Disease/Report of Accident (ROA), a form filed by the health care provider and the injured or ill worker to initiate a workers' compensation claim. These claims are assigned an OIICS code for the nature of the injury, source of injury, body part affected, and type of event (or exposure) that led to the injury. We aggregated results by the 1-digit OIICS event code [17].

Workers' compensation claim data collection
Washington State Fund and self-insured workers' compensation claims were included in this study. The data were abstracted from the system on July 23, 2019, and included claims with an injury date from 2013 through 2017 (n = 1,004,068) to ensure alignment with the ACS residential address data used. We extracted claimant surname and residential address for the BISG, along with the following key administrative variables: age, gender (binary male/female), OIICS injury event (injury type), claimant language preference, National Occupational Research Agenda (NORA) [18] industry sector, Standard Occupational Classification (SOC) major group, and claim adjudication status (rejected or accepted).

Census data
The American Community Survey (ACS) is a product of the U.S. Census Bureau, and replaces the long form decennial census. The ACS collects detailed information about population and housing characteristics as well as economic, social, and demographic data. For this study, the racial/ethnic distributions by Census Block Group were downloaded from the ACS 2017 5-year summary file for Washington State (2013-2017) [19]. We obtained race/ethnicity proportions for the six mutually exclusive categories above by surname from the 2010 Decennial Census Surname List [20]. Surnames must have been reported, along with self-reported race/ethnicity, at least 100 times in the 2010 Census to be included in the Census Surname List.
Denominator data for rate calculations were obtained from the Current Population Survey (CPS), using the Centers for Disease Control and Prevention (CDC), National Institute for Occupational Safety and Health (NIOSH), Employed Labor Force (ELF) query system [21] to estimate full-time equivalent workers (FTE) within each industry sector during 2013 through 2017.

Bayesian Improved Surname Geocoding (BISG) method
This study used the Bayesian Improved Surname Geocode (BISG) method developed by Elliott et al. [14] and detailed elsewhere [22] to estimate race/ethnicity among Washington workers' compensation claimants. Briefly, the BISG method uses racial/ethnic proportions by geographical area of residence and by surname, both from the U.S. Bureau of the Census (Census), to calculate posterior probabilities of belonging to six mutually exclusive racial/ethnic categories. Each record is assigned the posterior probabilities of being (1) White, (2) Black/African American, (3) American Indian/Alaska Native, (4) Asian/Pacific Islander, (5) "More than one race," and (6) Hispanic. Within each of the six categories, the posterior probabilities are then summed across records to estimate the race/ethnicity distribution in the sample. The BISG method is not designed to assign race/ethnicity to individual records [14]. While the BISG estimates probabilities for American Indian/Alaska Native and "More than one race," prior studies demonstrate that the BISG produces less robust estimates for these groups [14,22,23]; therefore, results for these two racial/ethnic groups will not be reported.
Bayes' theorem gives the probability of A given B (the posterior) as the probability of B given A, multiplied by the probability of A (the prior), and divided by the probability of B, written as:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
For each claimant, we estimate the probability of being Black given the proportion of residents in their block group who identify as Black, multiplied by the proportion of Americans with their surname who identify as Black. This is done for each additional race/ethnicity category: (1) White, (2) Black/African American, (3) American Indian/Alaska Native, (4) Asian/Pacific Islander, (5) "More than one race," and (6) Hispanic.
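The per-claimant calculation can be illustrated with a minimal sketch. It assumes the surname proportions act as the prior and that a per-category geographic term is supplied by the caller; the function name `bisg_posterior` and all numbers are hypothetical, and the exact geographic term used in the published BISG formulation may differ from the one passed in here. The products are normalized so that the probabilities across categories sum to 1.

```python
from typing import Dict


def bisg_posterior(surname_prob: Dict[str, float],
                   geo_term: Dict[str, float]) -> Dict[str, float]:
    """Combine surname and geography terms with Bayes' theorem.

    surname_prob: proportion of people with the surname in each category (prior).
    geo_term: per-category geographic weight for the claimant's block group
              (an assumption; supply whichever term the chosen formulation uses).
    """
    # Numerator of Bayes' theorem for each category.
    unnormalized = {k: surname_prob[k] * geo_term[k] for k in surname_prob}
    # Denominator: total over all categories, so posteriors sum to 1.
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all categories have zero probability")
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical two-category illustration (made-up numbers).
example = bisg_posterior({"A": 0.5, "B": 0.5}, {"A": 0.2, "B": 0.6})
print(example)
```

With these made-up inputs, category B receives three times the weight of category A after normalization. The output is a probability vector for the record, not an assignment of a single race/ethnicity.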

Analysis
Workers' compensation claimant surnames were matched to the Census Surname List to obtain proportions of race/ethnicity by surname. Claims with surnames that did not match the Census Surname List were excluded from analyses (n = 92,268, 9.2%). The Washington Master Addressing Services (WAMAS) in the Washington State Office of the Chief Information Officer standardized addresses to the USPS standard format and geocoded the claimant addresses using ArcGIS for this project [24]. Claimants with no residential address information, who did not live in Washington State, or who had a P.O. Box or rural road address were excluded from analyses (n = 92,466, 9.2%). Elliott and colleagues do not recommend using P.O. Boxes or rural road addresses for the BISG method because doing so reduces the accuracy of the estimation [14]. All remaining claimants were then assigned a 12-digit Federal Information Processing Standards (FIPS) code for Census Block Group. Claimants were matched to the ACS dataset with proportions of race/ethnicity groups by Census Block Group.
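As a small illustration of the 12-digit block group code, the sketch below splits a code into its standard components: 2-digit state, 3-digit county, 6-digit tract, and 1-digit block group. The function name `split_block_group_code` and the example code are hypothetical and are not taken from the study data.

```python
from typing import Dict


def split_block_group_code(code: str) -> Dict[str, str]:
    """Split a 12-digit Census Block Group FIPS code into its parts."""
    if len(code) != 12 or not code.isdigit():
        raise ValueError("expected a 12-digit numeric block group code")
    return {
        "state": code[0:2],        # 2-digit state FIPS code
        "county": code[2:5],       # 3-digit county FIPS code
        "tract": code[5:11],       # 6-digit census tract code
        "block_group": code[11],   # 1-digit block group code
    }


# Hypothetical code beginning with 53, the FIPS code for Washington State.
parts = split_block_group_code("530330001001")
print(parts)
```

Keeping the code as a string preserves leading zeros, which matter when joining claims to block group tables.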
Using Bayes' Theorem, we then calculated the posterior probabilities of a claimant belonging to each of the following four race/ethnicity categories: (1) White, (2) Black/African American, (3) Asian/Pacific Islander, and (4) Hispanic/Latino, for each claim remaining in the sample (n = 819,234). The posterior probabilities were then summed for the race/ethnicity categories by the administrative variables listed above to compare characteristics of filed claims by imputed race/ethnicity (e.g., age, gender, NORA industry sector, SOC occupation major group, etc.).
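The summation step can be sketched as follows. This is an illustrative outline only: the record layout (a `posterior` dictionary plus administrative fields), the function name `sum_posteriors`, and the values are all hypothetical. Each claim contributes its fractional probabilities to every race/ethnicity category rather than being assigned to one group.

```python
from collections import defaultdict
from typing import Dict, List


def sum_posteriors(claims: List[dict], by: str) -> Dict[str, Dict[str, float]]:
    """Sum posterior probabilities per race/ethnicity within levels of `by`."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for claim in claims:
        level = claim[by]
        # Add this claim's fractional contribution to every category.
        for group, prob in claim["posterior"].items():
            totals[level][group] += prob
    return {level: dict(groups) for level, groups in totals.items()}


# Hypothetical claims with made-up posterior probabilities.
claims = [
    {"sector": "Construction", "posterior": {"White": 0.7, "Hispanic/Latino": 0.3}},
    {"sector": "Construction", "posterior": {"White": 0.2, "Hispanic/Latino": 0.8}},
    {"sector": "Services", "posterior": {"White": 0.5, "Hispanic/Latino": 0.5}},
]
estimated_counts = sum_posteriors(claims, by="sector")
print(estimated_counts)
```

The resulting sums are estimated claim counts by group and need not be whole numbers.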
Workers' compensation claim rates were then calculated by race/ethnicity, NORA industry sector, and SOC major group (the 22 SOC major groups were combined in our analyses due to small numbers in some groups). Numerators were calculated by multiplying the summed posterior probabilities by the total number of claims in each industry sector. Denominator data for rate calculations were obtained from the CPS, using the Centers for Disease Control and Prevention (CDC), National Institute for Occupational Safety and Health (NIOSH), Employed Labor Force (ELF) query system [21] to estimate full-time equivalent workers (FTE) within each occupation and industry sector during 2013 through 2017. Claim incidence rates are presented per 1,000 FTE, with 95% confidence intervals. The confidence intervals were calculated using the delta method [25] to derive variance.
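A rate per 1,000 FTE and a rate ratio with an approximate 95% confidence interval can be sketched as below. This is not the study's exact variance calculation, which is not specified beyond the use of the delta method. The sketch assumes Poisson-distributed claim counts and treats the FTE denominators as fixed, in which case the delta method gives a variance of 1/a + 1/b for the log rate ratio. Function names and numbers are hypothetical.

```python
import math
from typing import Tuple


def rate_per_1000_fte(claims: float, fte: float) -> float:
    """Claim incidence rate per 1,000 full-time equivalent workers."""
    return 1000.0 * claims / fte


def rate_ratio_ci(claims_a: float, fte_a: float,
                  claims_b: float, fte_b: float,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Rate ratio (group a vs. group b) with a delta-method CI on the log scale.

    Assumes Poisson counts and fixed denominators (sampling error in the
    FTE estimates is ignored in this sketch).
    """
    rr = (claims_a / fte_a) / (claims_b / fte_b)
    se_log_rr = math.sqrt(1.0 / claims_a + 1.0 / claims_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts: 200 claims vs. 100 claims, each over 10,000 FTE.
print(rate_per_1000_fte(200, 10_000))
print(rate_ratio_ci(200, 10_000, 100, 10_000))
```

Because the numerators here would be sums of posterior probabilities rather than observed counts, a fuller treatment would also account for uncertainty in the imputation.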
The Washington State Department of Labor & Industries is a public health authority, and as such is allowed by 45 CFR 46.102(l)(2) to conduct public health surveillance to identify, monitor, assess, or investigate conditions of public health importance. The work presented here falls within public health surveillance activities as covered by the Common Rule and is not required to be approved by an institutional review board; therefore, no institutional review board approval was sought. A minimal dataset is available in the Supporting Information files. Access to additional confidential data can be requested by contacting

Results
Table 1 describes the claims studied, by estimated race/ethnicity. Hispanic/Latinos had the lowest percentage of rejected claims, while Asian/Pacific Islanders and Black/African Americans had the highest. Hispanic/Latinos had the highest proportion of claims accepted for medical aid only, and the lowest proportion accepted for medical and non-medical compensation. There were only minor differences by gender and age. Hispanic/Latinos were substantially less likely to prefer communication in English (55% versus 81-99%). The most common injury types were bodily reaction and overexertion injuries, accounting for about half of all claims accepted for medical and non-medical compensation during this period. Hispanic/Latinos had the lowest proportion of injuries classified as bodily reaction and overexertion (41.8%), and Whites had the highest (51.4%).
Table 2 presents rates of injury among accepted claims by race/ethnicity within each NORA industry sector, as well as rate ratios comparing all other racial/ethnic groups to Whites. Rates for all non-White groups were significantly different from rates among Whites for all industry sectors. Black/African Americans had considerably higher rates of accepted claims compared to Whites in every NORA sector, ranging from a fourteen-fold higher rate in Agriculture, Forestry, and Fishing to 2.3- to 8.4-fold higher rates in all other sectors. Rate ratios for Hispanic/Latino accepted claims compared to Whites were also higher in every NORA sector.
Table 3 provides rates by two-digit 2010 Standard Occupational Classification (crosswalked from 2010 Bureau of Census (BOC) codes). Table 3 rate ratios are similar to the industry-level rate ratios presented in Table 2, with Black/African Americans having the highest rates and rate ratios (compared to Whites) across all occupational groups, followed by Hispanic/Latino workers. Overall, Asian/Pacific Islanders had lower rates by occupation than Whites.

PLOS ONE
We compared claims included in the analysis (those with estimated race/ethnicity) with claims excluded because the surname was not on the Census Surname List or the address could not be coded to a block group. There were differences (at least a 5% difference between study data and missing data) for claim and injury type category. Differences were also found by claim type, with fewer claims accepted for medical aid only missed because the residential address was not codeable than because the surname was missing. Larger differences were found by claim type, language preference, NORA sector, and SOC occupation; results are presented in Table 4. Among these differences, claimants who preferred Spanish made up a lower percentage of claims with surnames not on the Census Surname List (4.5%) than of claims that were BISG coded (9.9%). Almost 12% of workers in the Agriculture, Forestry, and Fishing NORA sector were not included in the data set due to missing residential address data, and 18% in the Services SOC occupation group were not included, again due to non-codeable addresses.

Discussion
To our knowledge, this is the first population-based study to report workers' compensation claim rates by race/ethnicity within industry sector and occupation groups. Previous studies of occupational injury disparities found that a greater proportion of workers of color are employed in high hazard industries compared with white workers, suggesting that workers of color are at greater risk of workplace injury because of employment patterns [3][4][5]. To arrive at their conclusions, researchers combined occupational injury data by industry from one source with race/ethnicity data by industry from another source, linked at the level of industry and not the individual worker. This approach requires one to assume that injury risk is equal across all workers within an industry or occupation. Using BISG, we found that risk of WRII was not equal across workers in the same industry sector or occupational group. Within each industry sector, rates of WRII varied by race/ethnicity, with Black/African American workers experiencing the highest rates, and Asian/Pacific Islanders generally experiencing the lowest rates. Additionally, the magnitude of the difference in rates across race/ethnicity differed by industry and occupation, with the greatest differences in rates by race/ethnicity observed among the highest risk industries and occupations, notably Agriculture, Forestry, and Fishing. The ultimate goal for occupational safety and health professionals is to reduce all work-related injuries to zero, and while we have come a long way since the inception of OSHA, this goal is still far from a reality. That Black/African American and Hispanic/Latino workers have historically been over-represented [3,26] in work-related fatal and non-fatal injuries regardless of industry or occupation must be more than a passing description of results; it must become a significant focus for all public health professionals aiming for health equity.
Racism in housing, education, hiring, and occupational mobility has been shown to reduce the life chances of Black and Hispanic/Latino populations in America [4,27,28], and these issues combine to determine which jobs and occupations a person has access to. Limited by educational opportunities, residential segregation, discrimination in hiring, and lack of mobility, workers of color are more likely to be sorted into high hazard jobs [27,29,30], and this study found that even within occupation and industry, rates of injury differ significantly by race and ethnicity. These multiple forms of harm to historically marginalized racial and ethnic groups in the United States largely determine our individual and societal health, income, wealth, and future generational realities. It is imperative that we disrupt these overlapping vulnerabilities if (workplace) health equity is our goal.
While this study adds significantly to the literature by addressing industry sector, occupation, and race/ethnicity disparities in injury rates, it does have some notable limitations. First, the racial/ethnic probabilities were indirectly derived using a Bayesian statistical method, not self-reported. While the BISG has been shown to be a robust method for identifying racial/ethnic group membership [14,22], it is likely not as precise as self-report. Second, due to limitations of the BISG, we were not able to report injury rates for American Indian/Alaska Natives or More than One Race. Third, the BISG provides simultaneous probabilities for very general races and Hispanic/Latino ethnicity, but cannot distinguish between different subgroups such as Chinese or Cambodian, Peruvian or Puerto Rican, or distinct sovereign nation affiliations, nor can it distinguish between native and foreign-born workers. Finally, the missing address and surname data that resulted in many workers' compensation claims being dropped from analysis appear to have reduced the number of Spanish speakers in this study, which most likely underestimates the rates and rate ratios for Hispanics/Latinos. Future research should address these limitations and potentially add more information such as first name [31] and other data such as language spoken, which could improve the BISG estimates for the broad racial and Hispanic/Latino groups. Claim filing behavior is another important limitation. Fan et al. [32] used data from the Washington State Behavioral Risk Factor Surveillance System and found that, with the exception of American Indians, non-White workers were less likely to file a workers' compensation claim, although the difference was not found to be statistically significant. Differences between self-reported injuries and filing a workers' compensation claim varied considerably by occupation, with farming/forestry/fishing ranked highest in reporting a work-related injury or illness, but second lowest in filing a workers' compensation claim. If workers of color are less likely than White workers to file a workers' compensation claim, disparities in WRII by race and ethnicity are likely greater than estimated here.
Finally, it is possible that the differences in rates of WRII by race/ethnicity may still be due to segregation into high hazard industries and occupations, obscured by aggregation into the broad categories of industry and occupation used in this study. Estimates of injury rates for more detailed industry groupings would require population estimates of workers by race/ethnicity and industry based on sample sizes larger than the current CPS or some other robust source of employment data by industry and race/ethnicity (ACS involves a larger sample but produces estimates of employed persons and not FTE).
These limitations are noteworthy, but considering the lack of alternatives in the field of occupational health to quantify work injury disparities by race/ethnicity, industry, and occupation, this study still contributes a great deal to the ongoing discussion of racial/ethnic disparities in occupational safety and health. The high rates of injury we report by racial/ethnic categories are a cause for major concern. The working population in Washington State during this data period was 73% White, 3.3% Black/African American, 1.2% American Indian/Alaska Native, 10.2% Asian/Pacific Islander, and 12.2% Hispanic/Latino, while the rates per 1,000 FTE are higher for all non-White racial groups and Hispanic/Latino ethnicity compared to Whites. More work is needed to identify work-related, systemic, and individual characteristics. We need to improve our understanding of what creates these inequities to add to the discussion of reducing the disparities in work-related injuries and illnesses. Adding self-reported race and ethnicity to the workers' compensation system would be an ideal way to estimate racial and ethnic inequities; however, even if we were to implement this today, it would take years to have enough self-reported data to address some of the major findings in this study. Adding self-reported race and ethnicity to workers' compensation systems would, in the long run, allow us to identify more nuanced ethnic inequities that could lead to better targeting of interventions.

Conclusion
To our knowledge, no workers' compensation system in the United States collects data on the race and ethnicity of claimants, which makes it difficult to identify disparities in injury risk, health service access, utilization, and outcomes by these critical social constructs. Enumerating inequities in workers' compensation claim rates is foundational to identifying and addressing root causes of disparities in WRII. Utilizing a well-tested method for indirect estimation of race and ethnicity, we found that risk of WRII is not similar within an industry or occupation, and that certain industries and occupations have larger racial/ethnic disparities in who is at risk for injury. Understanding injury risk by race/ethnicity can better allocate resources for prevention, elicit new lines of research and provide researchers and policymakers with much needed knowledge of how racism might be affecting workplace safety and workers' compensation insurance programs. It is the first step toward identifying policies, procedures and laws that need to be dismantled, re-imagined, or created so that all workers, regardless of race and ethnicity, come home safe and healthy from work each day.